🔀 SIMD Programming
Specific
Vectorization, Parallel Computing, CPU Instructions, Performance
Beating Python’s GIL: Achieving a 130x Speedup in Batch Processing with Rust and Rayon
🦀 MIR Optimization · medium.com · 2d

APL Performance
🔗 Linear Lisp · aplwiki.com · 3d · Hacker News

facebookincubator/dispenso: The project provides high-performance concurrency, enabling highly parallel computation.
⏱️ Async Runtimes · github.com · 18h · Hacker News

Building CompilerSutra
🎓 Teaching Compilers · docs.google.com · 18h · DEV

Metal Quantized Attention: pulling M5 Max ahead with Int8 matrix multiplication
🗺️ Region Inference · releases.drawthings.ai · 1d · Hacker News

Accelerate CPU-based AI inference workloads using Intel AMX on Amazon EC2
🗺️ Region Inference · aws.amazon.com · 3d

Supercharging Redpanda Streaming with profile-guided optimization
📈 Performance Tools · redpanda.com · 22h

Intel Binary Optimization Tool Changes Code Execution with Heavy Vectorization
🎯 CPU Dispatch · techpowerup.com · 2d

'Performance without compromise': AMD debuts first dual 3D V-Cache Ryzen CPU in potential showdown against Threadripper and EPYC siblings
🎯 CPU Dispatch · techradar.com · 2d

Iteratively optimizing an SPSC queue
🎯 Ring Buffers · blog.c21-mac.com · 4d · r/cpp

Building a Production-Grade Vector Database in Rust: What We Shipped
🚂 Cranelift Backend · ferres.io · 1d · DEV

MXFP8 GEMM: Up to 99% of cuBLAS Performance Using CUDA and PTX
🔬 Nanopasses · danielvegamyhre.github.io · 4d · Hacker News

Intel Delivers Open, Scalable AI Performance in MLPerf Inference v6.0
🗺️ Region Inference · newsroom.intel.com · 1d

JetStream 3: A modern benchmark for high-performance, compute-intensive Web applications
⚡ Performance · blog.chromium.org · 2d · Hacker News, Blogger

Why I’m Building a Database Engine in C#
🗃️ Query Compilation · nockawa.github.io · 5d · Hacker News

Finding performance bottlenecks with Pyroscope and Alloy: An example using TON blockchain
🔗 Hash Algorithms · grafana.com · 3d

abdimoallim/psimd: A portable, header-only SIMD library for C (SSE2, SSE4.1, AVX/AVX2+FMA, NEON/AArch64, WebAssembly SIMD128, scalar fallback)
🔍 Peephole Optimization · github.com · 1d · r/C_Programming

[Benchmark] 740k QPS Single-thread / 1.45M Dual-thread on a VM. Encountering fluctuations and seeking expert analysis.
🌐 WASM Runtimes · github.com · 1d · r/java

m0at/rvllm: rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.
🦀 MIR Optimization · github.com · 5d · Hacker News

yash27-lab/batch_forge: A high-performance, bare-metal inference engine for JAX and Equinox models written in Rust. Features zero-copy Safetensors loading and hand-optimized Metal/Vulkan compute kernels for Transformers, Vision Language Models, and State-Space Models
🗺️ Region Inference · github.com · 3d · Hacker News